RANDOMISED DNA LIBRARIES AND DOUBLE-STRANDED RNA LIBRARIES, USE
AND METHOD OF PRODUCTION THEREOF


TECHNICAL FIELD 

This invention relates to DNA libraries based on plasmid or viral vectors that can
express double stranded RNA of 10-30 base pairs in length with all possible
sequences, where each double stranded RNA is formed either by a single RNA
molecule in the form of a hairpin, or by two separate RNA molecules with different
3'-overhangs. Each single member of such a DNA library encodes all components of
a double stranded RNA as specified above. Such a library can be used to screen
for double stranded RNA species that can induce a given phenotype without prior
knowledge of their target genes. This invention further relates to a method of
generating such a DNA library.

BACKGROUND ART 

Messenger RNA (mRNA) is normally perceived as the information-carrying
intermediate in protein synthesis that is transcribed by RNA polymerase from a DNA
template and subsequently translated by the ribosome to generate protein molecules.
Recently, more data have demonstrated that many genes are transcribed into RNA
molecules that are not translated into proteins at all (Okazaki Y et al., Nature;
420(6915):563-573 (2002)). Some of the untranslated RNA were found to carry out
functions in the regulation of other mRNA by inducing the degradation of the
mRNA in a sequence specific manner (Ambros V., Cell; 113(6):673-676 (2003)). This
is in good agreement with the recent finding that double stranded RNA and
synthetic siRNA can also induce cognate mRNA degradation in a wide range of
organisms (McManus MT, Sharp PA, Nature Rev Genet; 3(10):737-747 (2002)). Long
double stranded RNA was found to induce intensive non-specific inhibition of RNA
synthesis in mammalian cells, but siRNA can bypass this obstacle and still
maintain a strong inhibitory effect on a target gene that shares sequence identity with
the siRNA (Elbashir SM et al., Nature; 411(6836):494-498 (2001)). This has made
siRNA a primary tool for gene knockdown in functional genomics. SiRNA also has
the potential to become drugs that can be used to cure a disease by reducing the
activity of a disease related gene.

SiRNA are generally double stranded RNA of 19-25 base pairs that are either formed
by a single RNA molecule in the form of a hairpin or formed by two separate RNA
molecules with different 3'-overhangs. SiRNA can be produced in three ways:
chemical synthesis; expression from DNA vectors under the drive of a promoter;
and RNase III (Dicer) cleavage of long double stranded RNA. All siRNA that have
been used so far are designed to target a segment of a predefined gene.


SUMMARY OF INVENTION

The present invention relates to DNA libraries, each of which contains all possible
permutations (permutation refers to different sequences) of double stranded RNA of
a certain length. Such DNA libraries can be easily configured to produce all
permutations of siRNA. The invention provides a high throughput screening method
for double stranded RNA (as well as siRNA) in a target-independent manner for
indications related to any given phenotype. More specifically, the siRNA encoded by
such libraries can be used in such screening either individually or as a mixture of
any complexity, without the burden of knowing its sequence or its target gene. This
method can overcome two major obstacles in siRNA application: 1) The incomplete
knowledge about the transcriptome of each organism. According to recent data from
mouse transcriptome analysis, our knowledge about the transcriptome of this best
understood model animal is still far from complete. Much less is known about the
transcriptome of human and other animals. Since the application of our library does
not require any prior information about the target sequence, it will allow immediate
implementation of genome-wide siRNA screening in any organism. 2) The
extraordinarily high cost of siRNA. No matter how the siRNA is prepared, the cost of
making siRNA targeting all known mRNA of an organism is extremely high. A single
regenerable DNA library that contains all permutations of siRNA and can be applied
in any organism reduces the cost of siRNA production to a minimum.

Accordingly, in one aspect the present invention relates to a DNA library for the
production of a library of double stranded RNA molecules of a predefined length in
the range of 10-30 base pairs in living cells, wherein the sequence(s) of the DNA
region (or regions) encoding the double stranded part of the double stranded RNA
molecule(s) is randomized in a number selected from 4 to all nucleotide positions,
and wherein both strands of said double stranded RNA molecule are produced from a
single member of the DNA library. The invention also provides a kit containing the
DNA library.


In another aspect the present invention provides a method of preparing the DNA 
library. 

In yet another aspect the invention relates to an RNA library obtained from the DNA
library.

Further aspects and advantages of the invention will become evident hereinafter 
from the following detailed description and attached claims. 

DESCRIPTION OF THE DRAWINGS

Figure 1 shows an example of the construction of a DNA library that can encode all
permutations of double stranded RNA of a certain length: Example 1, a DNA library
that can encode all double stranded RNA with a 19 base pair duplex region and 3'
poly U overhangs. In Figure 1A, the cloning strategy is shown. In Figure 1B,
experimental verification of the quality of the library is demonstrated. As shown in
the agarose gel, a single clone (1x), pools of 10 clones (10x), and pools of 30 clones
give rise to a single expected band after enzyme cleavage, suggesting that most
clones in the library contain the expected insert. The same procedure can be used
to produce such DNA libraries encoding double stranded RNA of different lengths
(10-30 base pairs), as well as such DNA libraries with only part of the DNA sequence
(4-30 nt) randomized.

Figure 2 shows the construction of a plasmid to verify that the presence of two
promoters and two terminators on opposite sides of the RNA coding region can
afford efficient down-regulation of the expression of the target gene. With all scientific
knowledge available as of today, such efficient down-regulation can only be
achieved by the efficient production of double stranded RNA from the plasmid. Thus
it is concluded that this plasmid can efficiently produce double stranded RNA in
living cells. A shows the cloning strategy. B shows the gel analysis verifying that the
designed fragment is inserted into the plasmid. C illustrates the cell assay verifying
that the resulting plasmid induces efficient inhibition of the target gene Renilla
luciferase.

Figure 3 shows an example of an alternative method of generating DNA libraries
that encode all permutations of double stranded RNA of a given length. In Figure 3A,
the cloning strategy is shown. In Figure 3B, sequences of the different segments in A
with key restriction sites underlined are shown. The same procedure can be used to
produce such DNA libraries encoding double stranded RNA of different lengths
(10-30 base pairs), as well as such DNA libraries with only part of the DNA sequence
(4-30 nt) randomized.

Figure 4 shows another alternative method of generating DNA libraries that encode
all permutations of double stranded RNA of a given length. A illustrates the cloning
strategy. B illustrates sequences of the different segments in A with key restriction
sites underlined. The same procedure can be used to produce such DNA libraries
encoding double stranded RNA of different lengths (10-30 base pairs), as well as such
DNA libraries with only part of the DNA sequence (4-30 nt) randomized.

DETAILED DESCRIPTION OF THE INVENTION


Small interference RNA (siRNA) is a term initially used to define short double
stranded RNA that have a 19-21 nt double stranded region nested between 3'-UU or
TT or other single stranded overhangs. A number of variations of this original form
of siRNA (such as the hairpin type) have been introduced lately. Such siRNA can be
used to reduce the expression of genes having identical sequence to the siRNA
double stranded region in cells from a variety of different organisms. While longer
double stranded DNA and RNA could also be produced by means of the methods of the
invention, the libraries of the invention have been restricted to double stranded
DNA and RNA of a length of 10-30 base pairs, since above a length of 30 base
pairs the nucleotides are more likely to produce an immune response and other
disturbing side effects when transfected into living cells.

SiRNA were initially chemically synthesized, but several methods have been
introduced to generate siRNA enzymatically, using viral promoters such as the T7
promoter, or microRNA promoters such as H1 or U6, in free form or in plasmid or viral
vectors. The current invention provides a method to construct DNA libraries encoding
random siRNA libraries. Such a library differs from the prior art in that in the prior
art one would have to design the siRNA according to a known sequence of the gene,
whereas with the present library one can screen through a fully random panel of
different siRNA (without the need for prior knowledge of their sequences or their
target sequences) to look for phenotypes associated with each siRNA, and then identify
the genes related to each siRNA de novo.

Construction of DNA libraries containing a single randomized region 

The challenge of making a fully randomized DNA library based on plasmids or viral
vectors encoding all permutations of siRNA is to make sure that each member of the
DNA library expresses a distinct and complete double stranded RNA. None of the
existing methods of making vector-based siRNA (short double stranded RNA) can
meet this challenge.


The current invention describes the construction of a random DNA library with only
one randomized region. For each plasmid, two promoters drive the transcription of
this region from opposite directions to produce the two complementary RNA strands
separately. Two transcription terminators are placed, one at each end of the
randomized region, to make sure that RNA of a defined length is produced from each
direction. The advantage of this approach is that it avoids the troublesome cloning
procedure of the dual-region system, described below, for creating two reverse
complementary regions in each individual plasmid. One example of the promoters
that can be used in such a system is the RNA polymerase III promoters H1 or U6.
For RNA polymerase III, a stretch of TTTTT is needed for proper termination of
transcription. In order to use this RNA polymerase to drive expression of the same
region from both directions, the TTTTT stretch has to be inserted on both ends of
the randomized region. There is one problem though: the RNA polymerase III
promoters have to be placed immediately next to the randomized region to ensure
that transcription starts precisely at the beginning of the randomized region, but
those promoters do not contain an AAAAA stretch that would allow the TTTTT
terminator to appear on the opposite strand. The only way this can be done is to
mutate the RNA polymerase III promoters to insert such an AAAAA stretch, and it was
not known how the insertion of the AAAAA stretch would affect the start of
transcription and the rate of transcription. As will be shown below, we mutated the
H1 RNA polymerase III promoter, inserted an AAAAA stretch at the end of the
promoter, and verified that the mutated promoter supports proper transcription
start and production of effective siRNA. Thus, we first started to construct a plasmid
library with the termination signal placed on both sides of the randomized region
(Figure 1).
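
The strand symmetry that this design relies on can be illustrated with a short Python sketch. It is an illustration only, not part of the described procedure: the helper names are hypothetical, the promoters themselves are not modelled, and the 19-nt example is the Renilla luciferase test sequence used later in this description.

```python
# Minimal sketch: an AAAAA stretch on one strand reads as TTTTT on the other,
# so a cassette AAAAA-N19-TTTTT has the same layout when read from either side.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def cassette(core: str) -> str:
    """Hypothetical helper: core region flanked by AAAAA and TTTTT."""
    return "AAAAA" + core + "TTTTT"


core = "AATAAATGAATCAAGAACA"          # 19-nt example region
top_strand = cassette(core)
bottom_strand = reverse_complement(top_strand)

# The bottom strand is again AAAAA + (reverse complement of core) + TTTTT,
# so each promoter meets a TTTTT stretch after the 19-nt region.
print(top_strand)
print(bottom_strand)
```

Read in either direction, the region is preceded by AAAAA (the stretch placed at the end of each mutated promoter) and followed by TTTTT, which is why one randomized region can serve both promoters.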

Construction of the vector with dual-H1 promoter against Renilla luciferase

A plasmid with two mutated RNA polymerase III promoters, each embedding one
transcription terminator sequence for the other promoter, was constructed with the
siRNA region designed to target a model molecule, Renilla luciferase (Figure 2). The
key finding that such a plasmid can support the successful production of an effective
siRNA duplex from a single target sequence of 19 bp forms the basis for constructing
a fully randomized siRNA library that has only one randomized region (Figure 2).

Mutation of the H1 RNA polymerase III promoters and construction of the example
plasmid are described in detail below.

1. Delete 3 nucleotides immediately upstream of the Bgl II site in the pBluescript II
KS-H1 vector (Brummelkamp TR et al., Science, 296(5567):550-553 (2002)).

PCR amplify the fragment between EcoR I-Bgl II (H1 promoter) of the original vector
with the following primers:

5' primer: GGAATTCGAACGCTGACGTCATCAACCCG

3' primer: GAAGATCTGTCTCATACAGAACTTATAAGATTCCC

(mutation one: three (3) nucleotides just upstream of the Bgl II site were
deleted in order for transcription to start from the proper position after the
insertion of the AAAAA sequence as described below)

Clone the PCR product in between EcoR I-Bgl II into the original pBluescript II
KS-H1 (Brummelkamp TR et al. cited above) vector and verify the plasmid DNA by
sequencing.

The modified sequence: 

001 TCCAOONANC GCGGGCGCAG TGTCACTAGG CGGGAACACC CAGCGCGCGT
051 GCGCCCTGGC AGGAAGATGG CTGTGAGGGA CAGGGGAGTG GCGCCCTGCA
101 ATATTTGCAT GTCGCTATGT GTTCTGGGAA ATGACCATAA ACGTGAAATG
151 TCTTTGGATT TGGGAATCTT ATAAGTTCTG TATGAGACAG ATCTTCAATA
201 TTGGCCATTA GCCATATTAT TCATTGGTTA TATAGCATAA ATCAATATTG
251 GCTATTGGCC ATTGCATACG TTGTATCTAT ATCATAATAT GTACATTTAT
301 ATTGGCTCAT GTCCAATATG ACCGCCATGT TGGCATTGAT TATTGACTAG
351 TTATTAATAG TAATCAATTA CGGGGTCATT AGTTCATAGC CCATTATGGG
401 AGTTCCGCGT TACATAACTT ACGGTAAATG GCCCGCCTGG CTGACCGCCC
451 AACGACCCCC GCCCATTGAC GTCAATAATG ACGTATGTTC CCATAGTAAC
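
As a consistency check on the listing above, the following Python sketch looks for the binding site of the 3' primer from step 1 in the row starting at position 151. It is only a sketch: the variable names are hypothetical, and the row is copied from the listing with the spaces removed.

```python
# Minimal sketch: the 3' primer anneals to the strand shown in the listing,
# so its reverse complement should appear in the listed sequence.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


PRIMER_3 = "GAAGATCTGTCTCATACAGAACTTATAAGATTCCC"               # 3' primer, step 1
ROW_151 = "TCTTTGGATTTGGGAATCTTATAAGTTCTGTATGAGACAGATCTTCAATA"  # positions 151-200

binding_site = reverse_complement(PRIMER_3)
offset = ROW_151.find(binding_site)      # -1 would mean "not found"
has_bgl2 = "AGATCT" in binding_site      # Bgl II recognition site

print(binding_site, offset, has_bgl2)
```

The primer site, including the Bgl II site (AGATCT), falls within this row, which agrees with the cloning of the PCR product between EcoR I and Bgl II.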

2. Construction of the vector with mutated dual-H1 promoters (hereinafter referred
to as pDH, which stands for plasmid with Dual H1 promoters)

PCR amplify the fragment between EcoR I-Bgl II of the above modified vector with
the following primers:

5' primer: ACGCGTCGACGAATTCGAACGCTGACGTCATCAACCCG
3' primer: CCCAAGCTTGTCTCATACAGAACTTATAAGATTCCC

Clone the above PCR product in between Sal I-Hind III, in a reversed orientation,
into the above modified vector, and verify the plasmid DNA by Bgl II+Sal I digestion;
the correct clone should have a fragment of ~1000 bp. Results showed that all ten
clones checked were correct. (Please note: pDH actually contains two truncated
H1 promoters; this is due to the needs of the subsequent cloning process. The missing
part of the promoter will be made up during the subsequent cloning process.)


3. Put the Renilla luciferase target sequence into pDH to form pDHRL: A sequence
corresponding to nt 82-100 of Renilla luciferase mRNA was used as the test
DNA. siRNA targeting this site of Renilla luciferase was known to be active
(Brummelkamp TR et al. cited above). Two DNA oligonucleotides were synthesized and
annealed to each other to make the double stranded DNA:

5' GGGGAAGATCTAAAAAAATAAATGAATCAAGAACATTTTTAAGCTTGGGG
5' CCCCAAGCTTAAAAATGTTCTTGATT...

The above double stranded DNA was cleaved with Bgl II-Hind III and cloned in
between the Bgl II-Hind III sites in pDH. Correct insertion of the DNA fragment into
the plasmid DNA was verified by Bgl II+Sal I digestion, where the correct clone
should give rise to a ~250 bp fragment. All three clones tested were shown to have
the correct insert (Figure 2).
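
Since the two oligonucleotides in step 3 are annealed to each other, the second is expected to be the reverse complement of the first. The Python sketch below works this out under one stated assumption: that the first oligonucleotide follows the same layout as the library oligonucleotides described later (GGGGA, Bgl II site, AAAAA, 19-nt target, TTTTT, Hind III site, GGGG). The names are hypothetical and the result is a derived expectation, not a sequence quoted from the original.

```python
# Minimal sketch: derive the expected complementary strand of the test insert.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


TARGET = "AATAAATGAATCAAGAACA"   # 19-nt Renilla luciferase test region
# Assumed layout of the first oligonucleotide (see lead-in):
top_oligo = "GGGGA" + "AGATCT" + "AAAAA" + TARGET + "TTTTT" + "AAGCTT" + "GGGG"
bottom_oligo = reverse_complement(top_oligo)

print(top_oligo)
print(bottom_oligo)
```

The derived strand begins with CCCCAAGCTTAAAAATGTTCTTGATT, matching the legible start of the second oligonucleotide given above.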


Efficient inhibition of luciferase expression by pDHRL. The above three clones
(clone 1, clone 2 and clone 3) were each transfected into HEK293 cells on a 24-well
plate, at 1.2 µg and 0.6 µg respectively, together with plasmids encoding Renilla
luciferase and firefly luciferase. 48 hours later, the Renilla and firefly luciferase
activities were measured (Figure 2C). The results suggested that with the mutated
promoters the plasmid can induce very efficient inhibition of the expression of the
target gene Renilla luciferase, which indicated efficient production of siRNA from the
mutated H1 promoters in the dual promoter/dual terminator plasmid constructed in
the current invention. Specifically, the results suggested that with the mutated H1
promoter, the RNA transcription driven by RNA polymerase III can be properly
initiated and terminated, resulting in the efficient production of duplex RNA of proper
length that can induce significant RNA interference and inhibition of gene
expression.

Cloning the randomized DNA into pDH to form a library that encodes all
permutations of the siRNA

The construction of the randomized DNA library that encodes all permutations of siRNA
is done in a similar way to the construction of the anti-luciferase siRNA encoding
plasmid pDHRL, with the only difference that the second strand of the tester
sequence was generated enzymatically to preserve the randomized nature of the
sequence.

Three oligonucleotides were synthesized with 19, 20 and 21 nt of randomized region
embedded within the two known sequences.

19-mer randomized region
GGGGAAGATCTAAAAA NNNNNNNNNNNNNNNNNNN TTTTTAAGCTTGGGG

20-mer randomized region
GGGGAAGATCTAAAAA NNNNNNNNNNNNNNNNNNNN TTTTTAAGCTTGGGG

21-mer randomized region
GGGGAAGATCTAAAAA NNNNNNNNNNNNNNNNNNNNN TTTTTAAGCTTGGGG


The oligonucleotides were allowed to anneal to a primer CCCCAAGCTTAAAAA and
filled in with Klenow fragment in the presence of 1 mM dNTP in the
proper buffer (all chemicals other than DNA oligonucleotides were purchased from
New England Biolabs Inc. unless otherwise specified). The duplex oligonucleotides
were cleaved with Bgl II-Hind III and cloned into the Bgl II-Hind III sites of pDH to
form pDH-libraryA.
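
The annealing and fill-in step can be pictured with the following Python sketch. It is a simplified model under stated assumptions: one library member is drawn with a seeded random generator purely for illustration, the function names are hypothetical, and the enzymatic fill-in is represented simply as taking the reverse complement of the template.

```python
# Minimal sketch of primer annealing and second-strand fill-in for one member.
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")
LEFT_FLANK = "GGGGAAGATCTAAAAA"    # constant 5' part of the library oligonucleotide
RIGHT_FLANK = "TTTTTAAGCTTGGGG"    # constant 3' part of the library oligonucleotide
PRIMER = "CCCCAAGCTTAAAAA"         # fill-in primer from the text


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def random_member(n: int, rng: random.Random) -> str:
    """Hypothetical helper: one template with an n-nt random core."""
    core = "".join(rng.choice("ACGT") for _ in range(n))
    return LEFT_FLANK + core + RIGHT_FLANK


def fill_in(template: str, primer: str) -> str:
    """Model the fill-in: the primer must pair with the template's 3' end."""
    if not template.endswith(reverse_complement(primer)):
        raise ValueError("primer does not anneal to the 3' end of the template")
    return reverse_complement(template)


rng = random.Random(0)             # fixed seed, for a reproducible illustration
member = random_member(19, rng)
second_strand = fill_in(member, PRIMER)
print(member)
print(second_strand)
```

Because the second strand is copied from each template rather than synthesized separately, every duplex carries a matched pair of strands whatever the random core is.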

The quality of pDH-libraryA was first assessed by clone length analysis of 41
clones, where a single clone, a 10-clone pool and a 30-clone pool were used to
prepare plasmid DNA, which was then cleaved with restriction enzyme. The results
suggested that all clones have an insert of the same length (Figure 1B). The ten
clones were individually prepared and sequenced. All sequenced clones contain the
expected 19 base pair insert. Their sequences showed the expected randomness as
well (see below).

AAAGGGTTTACGTGGTTGG 

AATCGTCTTATTTGCATGC 

AATTGACATGTGAGCTTGG 

AGTAGCTTGTTGAGGTTGG 

CAGCATCACTGTATGTGTC 

CTATCTTCGTGGAGGTTGG 

CTATGAAGGTGGTGATGCG 

CTTAATTGGTGGTTGTAGG

TGGCTGTATGTGAGTGGCT 

TTAATCTCTGGTGTCCTAA 

TTGTAGGGACTTGGATGAT 
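
A quick way to summarize the listed clone sequences is sketched below in Python. The variable names are hypothetical, and one sequence is entered with a T where the source text shows the digit 1, which is an assumed reading. The sketch reports insert lengths, whether all inserts are distinct, and the overall base counts; it makes no statistical claim about randomness.

```python
# Minimal sketch: basic checks on the sequenced clone inserts listed above.
from collections import Counter

CLONES = [
    "AAAGGGTTTACGTGGTTGG",
    "AATCGTCTTATTTGCATGC",
    "AATTGACATGTGAGCTTGG",
    "AGTAGCTTGTTGAGGTTGG",
    "CAGCATCACTGTATGTGTC",
    "CTATCTTCGTGGAGGTTGG",
    "CTATGAAGGTGGTGATGCG",
    "CTTAATTGGTGGTTGTAGG",   # assumed reading of the source (digit 1 taken as T)
    "TGGCTGTATGTGAGTGGCT",
    "TTAATCTCTGGTGTCCTAA",
    "TTGTAGGGACTTGGATGAT",
]

lengths = {len(seq) for seq in CLONES}          # distinct insert lengths
all_distinct = len(set(CLONES)) == len(CLONES)  # no sequence seen twice
base_counts = Counter("".join(CLONES))          # pooled base composition

print(lengths, all_distinct, dict(base_counts))
```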

One alternative to plasmid vectors for ectopic expression of foreign genes is the
various types of viral vectors. Since all cloning strategies for constructing viral
vectors are common knowledge, and anybody with reasonable knowledge of the art can
produce viral constructs that carry out expression functions similar to those of the
plasmids, the disclosure of making DNA libraries as above will also enable the
production of such DNA libraries in viral vectors.



Construction of DNA libraries containing a pair of randomized regions with
inverted sequences

Although the vectors with two promoters and two terminators as represented by
pDHRL and pDH-libraryA are the preferred modes of the current invention, other
methods of forming DNA libraries that encode all permutations of siRNA become
obvious once the concept of a DNA library encoding all permutations of siRNA is
disclosed here. One such method is to form a plasmid library that encodes all
permutations of the hairpin form of the siRNA. As an example, such a library can be
formed according to the following procedure.

1. A library oligonucleotide was synthesized to contain a fully randomized region
of 19 nt nested between two predetermined sequences (P1 and P2), with the 5' end
phosphorylated. A hairpin-forming oligonucleotide was synthesized to contain a
5' phosphorylation and a 3' protruding stretch with a sequence complementary to
the P1 region. The library oligonucleotide and the hairpin DNA were annealed,
ligated with T4 DNA ligase and then filled in with Klenow fragment (Figure 3).
2. The extension mixture, after purification, was cleaved with BamH I and
ligated to a double stranded adaptor that has a cohesive end on one side and a
3' protruding stretch as a site for further priming (P3). After ligation the
DNA is size-selected so that only full length fragments that contain the library
oligonucleotide and the hairpin oligonucleotide, as well as the adaptor linker,
are collected.
3. Purified full length fragments are allowed to anneal to primer 3 (which is
complementary to the P3 priming site), and a strand-displacing DNA polymerase,
Phage29 DNA polymerase, is used to drive the synthesis of a DNA
fragment ALPHA. Each DNA fragment ALPHA contains: a fully double
stranded adaptor linker on each end of its sequence, and two identical copies of a
randomized sequence arranged in reverse orientation, where the two copies are
linked by the linearized sequence of the hairpin linker in its double stranded
form.
4. DNA fragment ALPHA can then be cleaved at proper sites in its adaptor
linker region and then ligated to a plasmid for further manipulation (Plasmid
alpha).
5. Plasmid alpha is first cleaved with Sam 1 and Bpm I, then filled in with
Klenow fragment and ligated. The resulting plasmid is propagated in E. coli,
and then the insert is cleaved with Bcg I to remove extra sequences between
the two randomized regions and leave a 9-nt stretch (TTCAAGAGA) to form the
loop in the future siRNA hairpin (Figure 3).
6. Afterwards, the insert can be cleaved from the plasmid with Hind III and
Bgl II and inserted into the pBluescript-H1 vector to form a library. This
library encodes all siRNA permutations in a hairpin form. In this case, the
plasmid only needs to have one promoter and one terminator for the formation
of hairpin RNA within the cells.
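
The layout of the final hairpin-encoding insert produced by steps 1 to 6 can be sketched in Python as follows. The sketch models only the end product (randomized region, 9-nt loop, inverted copy), not the enzymatic steps; the function names are hypothetical and the 19-nt example is the Renilla luciferase test sequence from earlier in this description.

```python
# Minimal sketch: one hairpin insert = sense region + loop + inverted copy.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
LOOP = "TTCAAGAGA"   # 9-nt loop stretch named in step 5


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def hairpin_insert(sense: str) -> str:
    """Hypothetical helper: coding sequence of one hairpin library member."""
    return sense + LOOP + reverse_complement(sense)


def stem_pairs(insert: str, stem_len: int) -> bool:
    """True if the first and last stem_len bases can pair to form the stem."""
    return reverse_complement(insert[:stem_len]) == insert[-stem_len:]


example = hairpin_insert("AATAAATGAATCAAGAACA")
print(example, len(example), stem_pairs(example, 19))
```

Because both arms of the stem sit on a single transcript, one promoter and one terminator suffice for this form of the library, as noted in step 6.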



Slight modification of the above cloning protocol, as illustrated in Figure 4, can
result in DNA libraries that have two wild type H1 promoters and two transcription
terminators, wherein each member of the library encodes the two separate strands of
a double stranded RNA. This involves the insertion of a second promoter and TTTTT
terminator between the two inverted randomized regions of the DNA library, as
illustrated in Figure 4. With the detailed disclosure described above and in
Figures 1-3, this alternative is obvious to a person skilled in the field.

It has to be stressed that, due to the enzymatic handling of the library, all siRNA
that contain the restriction enzyme sites are lost. This will result in about 0.025 %
siRNA loss for each restriction enzyme used. So in this sense the preferred mode of
the current invention, based on two promoters and two terminators, will suffer less
siRNA loss and be a more complete library than the library generated according to
the above hairpin library protocol, due to the number of enzymes used in the
individual protocols. Since the library contains about 2.75 x 10^11 permutations in
theory, the loss of siRNA species caused by the use of restriction enzymes will have
only a negligible effect on the quality of the library and on the screening of active
siRNA against any specific gene. In the text of this invention, the reference to "all
permutations of siRNA" should be understood as having this effect considered and
included. Further elimination of this effect can be achieved by eliminating the use of
restriction enzymes in the construction of the libraries.
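
One possible reading of the 0.025 % figure, offered here as an assumption rather than as the derivation behind the text, is the chance that a given 6-nt window of random sequence equals one particular 6-bp recognition site. The Python sketch below computes that value; the site length of 6 is assumed (it matches Bgl II and Hind III), and how the figure scales over the several windows within a longer random region is not addressed.

```python
# Minimal sketch: chance that one fixed 6-nt window matches a given 6-bp site.
from fractions import Fraction

SITE_LENGTH = 6                              # assumed recognition-site length
p_window = Fraction(1, 4 ** SITE_LENGTH)     # each position has 4 equally likely bases
percent = float(p_window) * 100

print(p_window, f"{percent:.4f} %")
```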



Another note is that the sequences and restriction enzymes are only one set of
examples that can be used to carry out the construction of the plasmid. The person
skilled in the art can easily choose different restriction enzymes and corresponding
sequences of the oligonucleotides to carry out the construction in a similar manner in
plasmids and viral vectors, according to the principle disclosed above.

Generation of DNA libraries that encode cell-specific, tissue-specific or species 
specific double stranded RNA 

With the disclosure of the random DNA libraries encoding all permutations of dou- 
ble stranded RNA of a given length, the method of establishing DNA libraries that 
encode cell-specific, tissue-specific or species specific double stranded RNA should 
be considered to be obvious to a person skilled in the field. One example of con- 
structing such DNA libraries is presented below. 

An oligonucleotide with 19 nt of randomized region is allowed to hybridize to mRNA
purified from a specific cell type. The mRNA can be immobilized onto a streptavidin
coated solid support (plastic beads, for example) via biotin added to the end of the
mRNA with Poly(A) polymerase. Immobilization of mRNA can be done in other ways
too. After hybridization, all unbound DNA oligonucleotides are washed away and
the bound sub-random DNA oligonucleotides are collected and cloned into the
vector by a protocol identical to the protocols described for fully randomized DNA
oligonucleotides. The libraries resulting from this process will be highly enriched for
molecules that encode double stranded RNA with sequences identical to the mRNA
sources.

It should be noted that although all cloning procedures herein are described in the
context of a single plasmid vector, the principle should be applicable to all types of
plasmids, and the cassette containing the mutated promoters, terminators and the
coding region of the DNA libraries can be transferred between those different types
of plasmids.

It should be further noted that although all cloning procedures are described in the
context of a single type of promoter, the H1 promoter, the principle should be
applicable to all types of RNA polymerase III promoters.


One alternative to plasmid vectors for ectopic expression of foreign genes is the
various types of viral vectors. Since all cloning strategies for constructing viral
vectors are common knowledge, and anybody with reasonable knowledge of the art can
produce viral constructs that carry out expression functions similar to those of the
plasmids, the disclosure of making DNA libraries as above should also enable the
production of such DNA libraries in viral vectors.

SUMMARY 

The current invention involves DNA libraries that can generate double stranded
RNA of 10-30 base pairs in length, with at least one strand of the double stranded
RNA having single stranded overhangs, and further involves methods to produce
such DNA libraries. It is acknowledged that the most frequently used double stranded
RNA is siRNA of 19-21 base pairs in length, normally with TT or UU overhangs on at
least one of the strands. So the advantage of the current invention is discussed in
comparison to siRNA generated by other methods.

In practice, only one in three to five or so short double stranded RNA that fulfill the
basic structural requirement (19-21 base pair double stranded region, 3' single
stranded overhangs, normally TT or UU but not limited to such overhangs) is
effective. For knocking down the 30,000 human genes using siRNA, about
90,000-150,000 siRNA will then have to be synthesized, at a cost of 18-30 million US
dollars. A similar cost has to be allocated to any additional organism for which the
full spectrum of siRNA is to be generated for all genes.

The current invention can generate a siRNA library encoded in plasmids that
contains in theory all the permutations (4^19 = 2.75 x 10^11) of siRNA (19 base pair
duplexes plus overhangs) (the size of libraries for double stranded RNA of other
lengths can be easily calculated in a similar way), and that can be used in any
organism for which proper promoter(s) can be found. The cost of generating this
library is just a minimal fraction of the cost of synthesizing all siRNA chemically. In
other words, this is a library with a complexity of 2.75 x 10^11 that contains reagents
that can silence any gene in a mammalian or non-mammalian system. This is a
very powerful toolbox for high throughput genome-wide functional genomics and
drug target screening, as well as nucleic acid drug development.
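
The figures quoted in this summary can be reproduced with a few lines of Python. The sketch below is arithmetic only; the variable names are hypothetical, and the per-siRNA cost is a value implied by the quoted totals rather than one stated in the text.

```python
# Minimal sketch: library complexity and the synthesis counts quoted above.
def library_size(n: int) -> int:
    """Number of distinct sequences of length n over A, C, G, T."""
    return 4 ** n


sizes = {n: library_size(n) for n in (19, 20, 21)}

GENES = 30_000                                   # human gene count used in the text
sirna_low, sirna_high = GENES * 3, GENES * 5     # 3 to 5 siRNA per gene
implied_unit_cost_low = 18_000_000 / sirna_low   # USD per siRNA implied by the totals
implied_unit_cost_high = 30_000_000 / sirna_high

print(sizes)
print(sirna_low, sirna_high, implied_unit_cost_low, implied_unit_cost_high)
```

For the 19-nt case this gives 274,877,906,944 sequences, which is the 2.75 x 10^11 quoted above; each added nucleotide multiplies the library size by four.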

The complexity of this library can be further reduced dramatically by introducing a
one-step oligoselection on the library oligonucleotides. Such an approach will lead
to the creation of gene-, cell/tissue-, or organism-specific siRNA encoding libraries
that have much lower complexity (10MG 8 }, without sacrificing the usefulness of the
library. Such a low complexity library can be partially or completely sequenced
using different sequencing methods, enabling the creation of plasmid collections
that contain known siRNA encoders for each gene in an organism such as human,
mouse or rat.

The description above is mostly based on the plasmid system, but the same library
and collection can be easily established in viral vectors using the same principle.

A few key classes of application of the invention are listed here as examples:

1) A full collection of siRNA encoding plasmids can be selected for any given
gene from this library through standard screening (which could be automated).

2) A full collection of siRNA encoding plasmids for any given cell type, tissue
and organism can be established according to the invention.

3) Such collections of siRNA encoding plasmids can then be easily evaluated for
their individual capacity to knock down gene expression.

4) Most powerfully, such DNA libraries can be used for phenotype-based
screening of target genes without prior knowledge of the target sequence or
the siRNA sequences; thus the artisan can avoid the biased pre-selection of
target genes. This will become one of the most significant ways of functional
annotation and drug target screening.

1. A DNA library for the production of a library of double stranded RNA molecules of
a predefined length in the range of 10-30 base pairs in living cells, wherein the
sequence(s) of the DNA region (or regions) encoding the double stranded part of
double stranded RNA molecule(s) is randomized in all nucleotide positions, and wherein
both strands of said double stranded RNA molecule are produced from a single
member of the DNA library.

2. A DNA library for the production of a library of double stranded RNA molecules of
a predefined length in the range of 19-30 base pairs in living cells, wherein the
sequence(s) of the DNA region (or regions) encoding the double stranded part of
double stranded RNA molecule(s) is randomized in at least 19 nucleotide positions, and
wherein both strands of said double stranded RNA molecule are produced from a
single member of the DNA library.

3. A DNA library for the production of a library of double stranded RNA molecules of
a predefined length in the range of 15-30 base pairs in living cells, wherein the
sequence(s) of the DNA region (or regions) encoding the double stranded part of
double stranded RNA molecule(s) is randomized in at least 15 nucleotide positions, and
wherein both strands of said double stranded RNA molecule are produced from a
single member of the DNA library.

4. A DNA library for the production of a library of double stranded RNA molecules of
a predefined length in the range of 10-30 base pairs in living cells, wherein the
sequence(s) of the DNA region (or regions) encoding the double stranded part of
double stranded RNA molecule(s) is randomized in at least 4, 7 or 10 nucleotide
positions, and wherein both strands of said double stranded RNA molecule are produced
from a single member of the DNA library.

5. A DNA library for the production of a library of double stranded RNA molecules of
a predefined length in the range of 10-30 base pairs in living cells, wherein the
sequence(s) of the DNA region (or regions) encoding the double stranded part of
double stranded RNA molecule(s) is randomized in 4 to all nucleotide positions, and
wherein both strands of said double stranded RNA molecule are produced from a
single member of the DNA library.

6. A DNA library of any of the claims 1 to 5, wherein said double stranded RNA
molecules also contain single stranded region(s) at one end or both ends of the
molecules.

7. The DNA library of any of the claims 1 to 6, wherein each member of the DNA
library contains one promoter for transcription of the double stranded RNA
molecules and one terminator for transcription of the double stranded RNA molecules,
and wherein the double stranded RNA is formed as a hairpin type double stranded
molecule.

8. The DNA library of any of the claims 1 to 6, wherein each member of the DNA
library contains at least two promoters for transcription of the components of the
double stranded RNA molecules and two terminators for transcription of the
components of the double stranded RNA molecules, and wherein the double stranded
RNA is formed by two separate RNA molecules that are complementary to each
other in the double stranded region.

9. The DNA library of claims 1 to 8, wherein the DNA library is constructed within a
plasmid vector.

10. The DNA library of claims 1 to 8, wherein the DNA library is constructed within
a viral vector.

11. A DNA library of claims 1 to 10 wherein the randomness of the library was
modified by selection of the random DNA oligonucleotides, before cloning the said
random DNA oligonucleotides into the vectors, through hybridization to a total RNA
preparation or total mRNA preparation from a source, whereby only the
oligonucleotides hybridized to the source RNA (or mRNA) are subsequently cloned into
the vector, and wherein the source can be a cell, a cell line, a tissue, or an organism.

12. A kit containing the DNA library of any of the claims 1 to 11.

13. A method of constructing a DNA library of any of the claims 1 to 6, and 8 to 11
wherein a pair of mutated H1 promoters are placed in opposite directions to drive
the RNA expression from the DNA fragment inserted between the two promoters,
wherein the said mutated H1 promoter differs from the wild type H1 promoter in at
least the sequence of the 5-nucleotide region immediately ahead of the transcription
starting site. The said 5-nucleotide region of the mutated H1 promoter is AAAAA.

14. An RNA library obtained from the DNA library of any of the claims 1-12,
wherein the length of double stranded RNA produced is in the range of 10 to 30
nucleotides.

15. A method of using the DNA libraries of any of the claims 1 to 12, wherein the
library is transiently or permanently introduced into cells as a mixture.

16. A method of using the DNA library of claims 1 to 12 to screen for double 
stranded RNA with biological functions. 

17. A method of using the DNA library of claims 1 to 12 to screen for novel genes.

18. A novel gene obtained by the methods of any of the claims 15 to 17.

19. A novel function of a gene obtained by methods of any of the claims 15 to 17. 

20. A pharmaceutical composition obtainable by the methods of any of the claims 15
to 17.

Figure 1.

Figure 2.
A.

Figure 2 (contd)

Figure 3.
A.

Hairpin: CAC ACG TGT CTT CGA ACA CAA TGC TAA TCT CTT GAA
P3: AGC TTA CTG CAC CC GGG GAT CCT GTT
Primer: AAC TGG ATC CCC GGG GTG CAG
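
The hairpin arrangement, in which a single transcript folds back on itself to form the double stranded region, can be illustrated with a short sketch. It builds a DNA insert as a random stem, a loop, and the reverse complement of the stem. The loop sequence, the stem length and all function names below are assumptions chosen for illustration; they are not taken from the figures or from the sequences listed above.

```python
# Illustrative sketch only: loop sequence, stem length and function names
# are assumptions, not the constructs shown in the figures.
import random

# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def random_stem(length: int, rng: random.Random) -> str:
    """Draw a stem in which every position is randomised over A, C, G, T."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def hairpin_insert(stem: str, loop: str = "TTCAAGAGA") -> str:
    """Sense stem + loop + antisense stem, written 5'->3' as DNA.
    The default loop is a hypothetical placeholder."""
    return stem + loop + reverse_complement(stem)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the example is repeatable
    stem = random_stem(19, rng)
    print("stem:  ", stem)
    print("insert:", hairpin_insert(stem))
```

Because the second half of the insert is derived from the first, both strands of the double stranded region are encoded by one library member, which is the property the hairpin design relies on.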



WO 2004/00104.1 IMM 2<M 1/00107 



Figure 4.
A.

P3: AGC TTA CTG CAC CC GGG GAT CCT GTT
Primer: AAC TGG ATC CCC GGG GTG CAG

Present claims 18-20 relate to a gene, a novel function of a gene
or a pharmaceutical composition defined by being obtained by the
method according to claims 15-17. The claims cover all genes,
functions and compositions having this characteristic, whereas
the application provides support within the meaning of Article 6
PCT and disclosure within the meaning of Article 5 PCT for no
such genes, functions or compositions. Additionally, previously
known genes, functions or compositions may be included in the
scope of the present claims. In the present case, the claims so
lack support, and the application so lacks disclosure, that a
meaningful search over the whole of the claimed scope is
impossible. Independent of the above reasoning, the claims also
lack clarity (Article 6 PCT). An attempt is made to define the
genes, functions and compositions by reference to a result to be
achieved. Again, this lack of clarity in the present case is such
as to render a meaningful search over the whole of the claimed
scope impossible. Therefore, no search has been performed for
claims 18-20.

The applicant's attention is drawn to the fact that claims, or
parts of claims, relating to inventions in respect of which no
international search report has been established will not be the
subject of an international preliminary examination (Rule 66.1(e)
PCT). This is the case irrespective of whether or not the claims
are amended following receipt of the search report or during any
Chapter II procedure.